Nonbipartite Dulmage-Mendelsohn Decomposition for Berge Duality
The Dulmage-Mendelsohn decomposition is a classical canonical decomposition
in matching theory applicable to bipartite graphs, and is famous not only for
its applications in the field of matrix computation, but also for providing a
prototypical structure in matroidal optimization theory. The
Dulmage-Mendelsohn decomposition is stated and proved using the two color
classes, and therefore generalizing this decomposition to nonbipartite graphs
has been a difficult task. In this paper, we obtain a new canonical
decomposition that generalizes the Dulmage-Mendelsohn decomposition to
arbitrary graphs, using a recently introduced tool in matching theory, the
basilica decomposition. Our result enables all known canonical decompositions
to be understood in a unified way. Furthermore, we apply our result to derive
a new theorem regarding barriers. The duality theorem for the maximum matching
problem is the celebrated Berge formula, in which the dual optimizers are
known as barriers. Several results regarding maximal barriers have been
derived from known canonical decompositions; however, no characterization has
been known for general graphs. In this paper, we provide a characterization
of the family of maximal barriers in general graphs, in which the known
results are developed and unified.
A User-Friendly Hybrid Sparse Matrix Class in C++
When implementing functionality that requires sparse matrices, there are
numerous storage formats to choose from, each with advantages and
disadvantages. To achieve good performance, several formats may need to be used
in one program, requiring explicit selection and conversion between the
formats. This can be both tedious and error-prone, especially for non-expert
users. Motivated by this issue, we present a user-friendly sparse matrix class
for the C++ language, with a high-level application programming interface
deliberately similar to the widely used MATLAB language. The class internally
uses two main approaches to achieve efficient execution: (i) a hybrid storage
framework, which automatically and seamlessly switches between three underlying
storage formats (compressed sparse column, coordinate list, Red-Black tree)
depending on which format is best suited for specific operations, and (ii)
template-based meta-programming to automatically detect and optimise execution
of common expression patterns. To facilitate relatively quick conversion of
research code into production environments, the class and its associated
functions provide a suite of essential sparse linear algebra functionality
(e.g., arithmetic operations, submatrix manipulation) as well as high-level
functions for sparse eigendecompositions and linear equation solvers. The
latter are achieved by providing easy-to-use abstractions of the low-level
ARPACK and SuperLU libraries. The source code is open and provided under the
permissive Apache 2.0 license, allowing unencumbered use in commercial
products.
Comparative Performance Analysis of Coarse Solvers for Algebraic Multigrid on Multicore and Manycore Architectures
We study the performance of a two-level algebraic-multigrid algorithm, with a focus on the impact of the coarse-grid solver on performance. We consider two algorithms for solving the coarse-space systems: the preconditioned conjugate gradient method and a new robust HSS-embedded low-rank sparse-factorization algorithm. Our test data comes from the SPE Comparative Solution Project for oil-reservoir simulations. We contrast the performance of our code on one 12-core socket of a Cray XC30 machine with its performance on a 60-core Intel Xeon Phi coprocessor. To obtain top performance, we optimized the code to take full advantage of fine-grained parallelism and made it thread-friendly for high thread counts. We also developed a bounds-and-bottlenecks performance model of the solver, which we used to guide the optimization effort, and carried out performance tuning in the solver's large parameter space. As a result, significant speedups were obtained on both machines.
Fast interior point solution of quadratic programming problems arising from PDE-constrained optimization
Interior point methods provide an attractive class of approaches for solving linear, quadratic and nonlinear programming problems, due to their excellent efficiency and wide applicability. In this paper, we consider PDE-constrained optimization problems with bound constraints on the state and control variables, and their representation at the discrete level as quadratic programming problems. To tackle complex problems and achieve high accuracy in the solution, one must solve matrix systems of huge scale resulting from the Newton iteration, and hence fast and robust methods for these systems are required. We present preconditioned iterative techniques for solving a number of these problems using Krylov subspace methods, considering the circumstances under which one may predict rapid convergence of the solvers in theory, as well as the convergence observed in practical computations.
A Schur complement approach to preconditioning sparse linear least-squares problems with some dense rows
The effectiveness of sparse matrix techniques for directly solving large-scale linear least-squares problems is severely limited if the system matrix A has one or more nearly dense rows. In this paper, we partition the rows of A into sparse rows and dense rows (A_s and A_d) and apply the Schur complement approach. A potential difficulty is that the reduced normal matrix A_s^T A_s is often rank-deficient, even if A is of full rank. To overcome this, we propose explicitly removing null columns of A_s, employing a regularization parameter, and using the resulting Cholesky factors as a preconditioner for an iterative solver applied to the symmetric indefinite reduced augmented system. We consider complete factorizations as well as incomplete Cholesky factorizations of the shifted reduced normal matrix. Numerical experiments are performed on a range of large least-squares problems arising from practical applications. These demonstrate the effectiveness of the proposed approach when combined with either a sparse parallel direct solver or a robust incomplete Cholesky factorization algorithm.
Using the realized relationship matrix to disentangle confounding factors for the estimation of genetic variance components of complex traits
Background: In the analysis of complex traits, genetic effects can be confounded with non-genetic effects, especially when using full-sib families. Dominance and epistatic effects are typically confounded with additive genetic and non-genetic effects. This confounding may cause the estimated genetic variance components to be inaccurate and biased.
Fast and accurate protein substructure searching with simulated annealing and GPUs
Background: Searching a database of protein structures for matches to a query structure, or for occurrences of a structural motif, is an important task in structural biology and bioinformatics. While there are many existing methods for structural similarity searching, faster and more accurate approaches are still required, and few current methods are capable of substructure (motif) searching.
Results: We developed an improved heuristic for tableau-based protein structure and substructure searching using simulated annealing, which is as fast as or faster than, and comparable in accuracy to, some widely used existing methods. Furthermore, we created a parallel implementation on a modern graphics processing unit (GPU).
Conclusions: The GPU implementation achieves up to 34 times speedup over the CPU implementation of tableau-based structure search with simulated annealing, making it one of the fastest available methods. To the best of our knowledge, this is the first application of a GPU to the protein structural search problem.